Polynomial regression model for duration prediction in Mandarin
Authors
Abstract
Duration modeling establishes a mapping between the prosodic context and the segmental duration observed in natural speech. In this paper, we first study the effect of prosodic features on segmental duration in neutral Mandarin utterances using a statistical measure, eta squared. We then select the most influential prosodic features and design an interaction-quantifying algorithm to study the interactions among them. Finally, we specify the duration model as a polynomial and obtain its coefficients through nonlinear regression. Our results indicate that 5 to 6 prosodic features are largely sufficient for a close and accurate mapping between prosodic context and perceived duration. Compared with the Wagon tree method, the proposed approach shows clear advantages.
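The abstract does not spell out how eta squared is computed, so the following is only a minimal sketch of the standard one-way ANOVA definition (between-group sum of squares divided by total sum of squares), which may differ in detail from the authors' procedure. The function name `eta_squared`, the feature names, and the toy durations are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def eta_squared(levels, durations):
    """Share of duration variance explained by one categorical feature.

    Standard definition: eta^2 = SS_between / SS_total.
    """
    if len(levels) != len(durations) or not durations:
        raise ValueError("levels and durations must be non-empty and equal length")

    grand_mean = sum(durations) / len(durations)

    # Group the durations by the level of the prosodic feature.
    groups = defaultdict(list)
    for level, duration in zip(levels, durations):
        groups[level].append(duration)

    ss_total = sum((d - grand_mean) ** 2 for d in durations)
    if ss_total == 0:
        return 0.0  # no variance at all, so nothing to explain

    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return ss_between / ss_total


# Hypothetical toy data: syllable durations in ms with two invented features.
durations = [180.0, 190.0, 240.0, 250.0]
phrase_final = ["no", "no", "yes", "yes"]
tone = ["T1", "T4", "T1", "T4"]

eta_position = eta_squared(phrase_final, durations)
eta_tone = eta_squared(tone, durations)
```

On this invented data the phrase-position feature explains far more of the duration variance than the tone feature, which is the kind of ranking one could use to shortlist features before fitting a polynomial model.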
Related articles
Totally data-driven intonation prediction model using a novel F0 contour parametric representation
This paper proposes a novel parametric representation of Mandarin intonation based on orthogonal polynomial approximation. The polynomial is a simplified representation of the Parallel Encoding and Target Approximation (PENTA) intonation model, which includes a target component and an approximation component. We also propose predicting the polynomial parameters from linguistic and phonetic attributes...
A Unified Totally-Data-Driven Framework for Duration and Intonation Modeling
This paper proposes a unified framework for duration and intonation modeling in Mandarin TTS. In this framework, we design a novel parametric representation of Mandarin intonation based on orthogonal polynomial approximation. With this representation, we can decompose the F0 vector into 3 orthogonal polynomial parameters that are continuous scalars. Based on this vector-to-scalar decomposition, we ca...
An effective initial/final duration prediction method for corpus-based singing voice synthesis of Mandarin Chinese
In this paper, we propose an effective method for predicting initial/final duration for corpus-based singing voice synthesis of Mandarin Chinese. The goal of the method is to improve the naturalness and clarity of the synthesized singing voices. To achieve this goal, we construct an individual initial/final (I/F) duration prediction model for each category of consonants. Support vector machine ...
Phrase break prediction using logistic generalized linear model
In this paper we propose a novel phrase break prediction model for Mandarin speech synthesis. The model is a generalized linear model (GLM) fitted by stepwise regression. We assume that phrase breaks follow a Bernoulli distribution and model the phrase break probability with a logistic GLM. The attribute set is selected automatically by stepwise regression, which is a totally data-driven method. We also introd...
Nonparametric Regression Estimation under Kernel Polynomial Model for Unstructured Data
The nonparametric estimation (NE) of the kernel polynomial regression (KPR) model is a powerful tool for visually depicting the effect of covariates on a response variable when the data are unstructured and heterogeneous. In this paper we introduce a KPR model that is a mixture of nonparametric regression models with a bootstrap algorithm, which is considered in a heterogeneous and unstructured framewo...